perm filename CONCLU[0,BGB]14 blob
sn#115973 filedate 1974-08-19 generic text, type C, neo UTF8
{⊂C;<N;αRESULTS AND CONCLUSIONS.;λ30;P100;I425,0;JCFA} SECTION 10.
{JCFD} RESULTS AND CONCLUSIONS.
{λ10;W250;JAFA}
10.1 Results: Accomplishments and Original Contributions.
10.2 Critique: Errors and Omissions.
10.3 Suggestions for Future Work.
10.4 Conclusions.
{λ30;W0;I800,0;JUFA}
⊂10.1 Results: Accomplishments and Original Contributions.⊃
As a regular feature of a Ph.D. dissertation, the author is required
to state explicitly what has been accomplished and what is original.
Some of what has been accomplished is itemized in box 10.1, with the
so-called <original contributions> marked by asterisks. Each of the
accomplishments has been elaborated in the indicated chapter.
{|;λ10;T150,165,900;JA;FA}
BOX 10.1{JC} ACCOMPLISHMENTS AND ORIGINAL CONTRIBUTIONS.
   0. The Geometric Feedback Vision Theory               chapter 6.
*  1. The Winged Edge Polyhedron Representation          chapter 2.
*  2. The Euler Primitives for Polyhedron Construction   chapter 3.
   3. The Iron Triangle Camera Locus Algorithm           chapter 9.
*  4. The OCCULT hidden line elimination algorithm       chapter 4.
*  5. The Polygon Nesting Algorithm                      chapter 7.
*  6. The Polygon Dekinking Method                       chapter 7.
   7. The Polygon Segmenting Method                      chapter 7.
   8. The Polygon Comparing Method                       chapter 8.
*  9. Silhouette Cone Intersection                       chapters 5 and 9.
{|;T-1;λ30;JUFA}
As a whole, the system described in this thesis is the third
of its kind, succeeding the systems of Roberts (1963) and Falk
(1970). Although the modeling routines of the present system are
considerably more sophisticated than those of its predecessors,
improvement in the visual analysis routines is less dramatic and more
open to question. The present image analysis differs from that of the
earlier systems in two respects: emphasis is placed on the use of
multiple images for the sake of parallax depth perception, and several
spatially coherent image representations (contour image, mosaic image
and raster image) are combined to preserve the structure of the scene
through feature extraction, rather than following the earlier design
paradigm of extracting features from the image piecemeal and
attempting to splice them together afterwards.
As a system design, the present work can be compared with
earlier works by comparing the block diagrams, the characteristically
circular mandalas of feedback vision. Mandala-like diagrams
appear in (Newell); (Falk) Figure 4-7, page 78; (Grape) Figure 12.1,
page 242; (Tenenbaum) Figure 1.13, page 43; as well as in this work,
Figure 6.1, page 70. The feedback mandala is conspicuously absent in
the best of the stimulus-response visual parsing work, (Waltz), as
well as in statistical recognition work, (Duda and Hart).
The important ideas depicted in the feedback vision mandala are
the duality of the simulated and physical worlds, the duality of
description and verification, the duality of camera and body locus
solving, and the dual opposing flows of predicted and perceived images
along a hierarchy of commensurate abstractions. Newell's general
schema of a problem solver (embellished with disembodied eyeballs)
has the two worlds dichotomy, but lacks the compare steps.
Tenenbaum's figure, as well as his thesis as a whole,
illustrates the basic feedback loop in the immediate vicinity of the
visual sensor. The diagrams of Falk and Grape are similar
mirrors of the overall system design of the Stanford Hand/Eye group
(1969 to 1973) under the leadership of Professor Feldman. The
two diagrams depict an array of relevant boxes (camera solver,
edge finder, world modeler and so on) all sending messages to each
other under the benign direction of a box labeled "general
strategist".
Among the elements composing the GEOMED/CRE system, the most
original accomplishment is the winged edge polyhedron representation.
In computer graphics, models are based on face perimeter lists
(or arrays), with an awareness that more topological relations exist
but with no realization that a substantial improvement in surface
topology modeling is feasible with approximately the same memory and
computational resources.
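The contrast with face perimeter lists can be illustrated by a
minimal sketch, written here in Python rather than in the language of
the actual implementation. The field names, the converter
build_winged_edges and the query neighbours_of_face are hypothetical
names chosen for this illustration; they are not the GEOMED node
layout, which is given in chapter 2. The sketch assumes a closed
surface whose face perimeters are consistently oriented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(eq=False)
class Edge:
    """One edge record: two vertices, two faces, four wing links (hypothetical layout)."""
    pvt: int                      # tail vertex
    nvt: int                      # head vertex
    pface: Optional[int] = None   # face whose perimeter runs pvt -> nvt
    nface: Optional[int] = None   # face whose perimeter runs nvt -> pvt
    pface_next: Optional["Edge"] = None   # wings: neighbouring edges
    pface_prev: Optional["Edge"] = None   # on the perimeter of each
    nface_next: Optional["Edge"] = None   # of the two faces
    nface_prev: Optional["Edge"] = None


def build_winged_edges(faces: List[List[int]]) -> Dict[Tuple[int, int], Edge]:
    """Convert consistently oriented face perimeter lists into winged edges."""
    edges: Dict[Tuple[int, int], Edge] = {}
    loops: List[List[Edge]] = []
    for f, loop in enumerate(faces):
        edge_loop = []
        for a, b in zip(loop, loop[1:] + loop[:1]):
            key = (min(a, b), max(a, b))
            e = edges.get(key)
            if e is None:
                e = edges[key] = Edge(pvt=a, nvt=b)
            if (e.pvt, e.nvt) == (a, b):
                e.pface = f
            else:
                e.nface = f
            edge_loop.append(e)
        loops.append(edge_loop)
    # Second pass: link each edge to its neighbours around both faces.
    for f, edge_loop in enumerate(loops):
        n = len(edge_loop)
        for i, e in enumerate(edge_loop):
            nxt, prv = edge_loop[(i + 1) % n], edge_loop[i - 1]
            if e.pface == f:
                e.pface_next, e.pface_prev = nxt, prv
            else:
                e.nface_next, e.nface_prev = nxt, prv
    return edges


def neighbours_of_face(edges: Dict[Tuple[int, int], Edge], f: int) -> List[int]:
    """Walk the perimeter of face f by wing links, listing the face across each edge."""
    start = next(e for e in edges.values() if f in (e.pface, e.nface))
    result, e = [], start
    while True:
        if e.pface == f:
            result.append(e.nface)
            e = e.pface_next
        else:
            result.append(e.pface)
            e = e.nface_next
        if e is start:
            return result


# A tetrahedron given as four outward oriented face perimeter lists.
tetrahedron = build_winged_edges([[0, 1, 2], [0, 3, 1], [1, 3, 2], [0, 2, 3]])
print(len(tetrahedron), neighbours_of_face(tetrahedron, 0))
```

With face perimeter lists alone, finding the faces adjacent to a
given face requires a search over all the lists; in the sketch the
same question is answered by following links around one perimeter.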
The idea for the Euler primitives was based on a constructive
proof of the Euler relation found in (Coxeter 61), which, combined
with a fondness for sweep operators, resulted in the Euler
primitives. Comparison with other work is difficult, since other
graphics models lack a level of abstraction falling between the level
of node/link operations and operations with solids. The Euler
primitives were a blessing in implementing OCCULT and GEOMED sweep
and glue operations; the Euler primitives were a deceptive curse in
implementing the body intersector, BIN.
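The constructive idea can be sketched by tracking only the counts of
vertices, edges and faces of a simple polyhedron, for which the Euler
relation is V - E + F = 2. The sketch below is in Python and is an
illustration only: the names make_body, make_edge_vertex and
make_face_edge are hypothetical, no geometry or topology is stored,
and bodies with holes or handles are not covered. Chapter 3 gives the
actual primitives.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    vertices: int = 0
    edges: int = 0
    faces: int = 0

    def euler_ok(self) -> bool:
        # Euler relation for a simple (genus zero) polyhedron.
        return self.vertices - self.edges + self.faces == 2


def make_body() -> Counts:
    """Seed body: one vertex lying on one face (1 - 0 + 1 = 2)."""
    return Counts(vertices=1, edges=0, faces=1)


def make_edge_vertex(c: Counts) -> None:
    """Grow a new edge out to a new vertex: V and E both rise by one."""
    c.vertices += 1
    c.edges += 1
    assert c.euler_ok()


def make_face_edge(c: Counts) -> None:
    """Split a face by a new edge between existing vertices: F and E both rise by one."""
    c.faces += 1
    c.edges += 1
    assert c.euler_ok()


def cube_counts() -> Counts:
    """Reach the counts of a cube using only the two make steps."""
    c = make_body()
    for _ in range(7):      # seven more vertices, seven edges
        make_edge_vertex(c)
    for _ in range(5):      # five more faces, five more edges
        make_face_edge(c)
    return c


print(cube_counts())
```

Every intermediate state satisfies the relation, which is the sense
in which a sequence of such steps is a constructive proof: a body
built only from them cannot violate the Euler relation.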
A pre-computer form of the Iron Triangle camera solving
method appears in a paper by Berkay (59). Berkay described the method
as an analog procedure to be performed with paper, ruler and a few
other photogrammetric hand tools. (The existence of this paper was
pointed out to me by Irwin Sobel.)
The original accomplishment of the hidden line eliminator,
OCCULT, lies in its unification of methods and in its exploitation of
object and image coherence made feasible by the Euler primitives and
the Winged Edge Representation.
The last five accomplishments listed in box 10.1 are related
to vision. The nesting and dekinking problems have been stated and
solved by others; the present solutions are original only in
technical detail: the nesting for its use of memory to avoid a
combinatorial number of compares, and the dekinking for its
achievement of good results with almost no effort. The recursive
polygon segmentation idea and the polygon compare idea have been in
the vision and graphics oral tradition for as long as I have (since
1967), although I have found no references for the methods.
⊂10.2 Critique: Errors and Omissions.⊃
The major weakness in the existing modeling system is that it
lacks overall unity: the modeling and image analysis are not yet
sufficiently integrated. The second major weakness is that the
essential subsystems involving comparing, locus solving and
recognition are still in a primitive condition. Consequently, an
unambiguous objective demonstration of the relevance of 3-D modeling
to computer vision is missing; the particular demonstration which I
had in mind was to have a robot vehicle drive outside around the
laboratory, visually servoing along a trajectory given in advance.
{λ9;|;JA}
Box 10.2 {λ9;JAJC} ITEMS WHICH SHOULD HAVE BEEN DONE YESTERDAY.
1. System unity: Image Analysis with 3-D Geometric Models.
2. Image Compare Problems.
3. Locus Solving Problems.
4. 3-D Geometric Recognition.
{|λ30;JUFA}
In the course of this work, technical failures have included
the attempt to use the Euler primitives to implement body
intersection, the attempt to bundle contour images into mosaic
images, as well as attempts to make the Euler kill primitives
logically airtight without time-consuming model checking. The worst
system design errors take the form of misallocated effort: more time
might have been spent on image analysis programming and less time on
image synthesis work, and so forth. The research suffers from having
no objective criterion for deciding which of several possibilities
deserves the most immediate effort.
A final great barrier to progress in computer vision is the
inadequacy of the hardware. It may be true that "It is a poor workman
who blames his tools"; but for me the greatest single source of
personal frustration has been the television cameras and robotics
hardware: cart and turntable. At Stanford these devices have not been
implemented or maintained with sufficient care to make them
convenient to use.
{Q}
⊂10.3 Suggestions for Future Work.⊃
The application of geometric modeling to vision and robotics
raises numerous interesting ideas and problems (box 10.3).
{|λ9;JA}
Box 10.3 {λ9;JAJC} SUGGESTIONS FOR FUTURE WORK.
SPATIAL MODELING WORK.
0. Combination Geometric Models - Converters.
1. Cellular Space Modeling - Tetrahedral Simplices.
2. Spatial Simulation: Collision Avoidance Problem.
3. Higher Dimensionality, 4-D GEOMED.
SIMULATIONS.
4. Mechanical Simulation.
5. Creature Simulations.
6. Geometric Task Planning.
7. Geometric/Semantics Modeling.
MATHEMATICALLY ORIENTED PROBLEMS.
8. The Manifold Resurfacing Problem.
9. The Curved Patches Problem.
10. Prove the Correctness of a Hidden Line Eliminator.
GET RICH QUICK APPLICATIONS.
11. Automatic Machine Shop.
12. Animation for Entertainment Industry.
SYSTEMS SOFTWARE AND VISION HARDWARE WORK.
13. Better Loader and/or Incremental Assembler.
14. Better Cameras.
15. Image Oriented Number Crunching Computer Hardware.
16. Better Robot Vehicles.
{|λ30;JUFA}
Future development of <Combination Geometric Models> may
begin by writing converters between geometric representations. For
example, there is a need to convert polyhedra into spine cross
sections, space points into polyhedra, contour maps into faceted
surfaces and so on. Extramural combination models include
<Geometric/Semantic Modeling> which will be needed to cover the gulf
between Minsky's (1974) notion of a visual frame-system (e.g.
expectation of a room) and a geometric prediction of the features to
be found in the image. Although the Minsky Frame-System theory does
not explicitly reveal the crucial interface between numerical
geometric modeling and symbolic abstractions, that nexus is a central
part of the frame-system idea.
The <Cellular Space Modeling> idea is that both space and
objects should be modeled using a space filling tessellation of cells,
perhaps using the tetrahedral 3-simplex. The difficulty is in
getting the Euclidean primitives to correctly update the geometry and
topology of empty space as an object moves and rotates. The rewards
might include an elegant approach to collision avoidance problems
in vehicle navigation and arm trajectory planning. Another approach
to <spatial simulation> and <collision avoidance problems> that might
be pursued is the use of a simulated viewpoint to see obstacle-free
trajectories by means of hidden line elimination; this method is
suggested in (Sutherland 69).
In several recent Stanford dissertations (Falk, Yakimovsky,
Grape, and so on), the authors conclude with the prediction that
their essentially 2-D techniques can readily be extended to 3-D in
future work. In my turn, I seriously wish to propose that my
essentially 3-D techniques can be extended to 4-D. The resulting
models could be applied to Regge Calculus for computing the general
relativistic geometric models of such systems as two or three
colliding black holes; or, on a less cosmic level, a 4-D GEOMED could
be of service for planning sequences of arm manipulations, viewing
time as a spatial dimension. Collision of 3-D polyhedra moving in
time can be described as a static intersection of 4-D polytopes.
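The idea can be illustrated two dimensions lower, where it is easy
to check by hand: a 1-D interval moving at constant velocity sweeps a
parallelogram in the (x, t) plane, and two such intervals collide
within the time span exactly when their parallelograms intersect. The
Python sketch below is an analogue only and not a 4-D GEOMED; the
function names are hypothetical, constant velocities are assumed, and
the separating axis test for convex polygons is a standard technique
substituted here for a real polytope intersector.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # (x, t): one space axis and the time axis


def world_polygon(x0: float, width: float, velocity: float,
                  t_end: float) -> List[Point]:
    """Space-time region swept by [x0, x0 + width] for 0 <= t <= t_end."""
    dx = velocity * t_end
    return [(x0, 0.0), (x0 + width, 0.0),
            (x0 + width + dx, t_end), (x0 + dx, t_end)]


def _project(poly: List[Point], axis: Point) -> Tuple[float, float]:
    """Extent of a polygon along an axis (the axis need not be unit length)."""
    dots = [p[0] * axis[0] + p[1] * axis[1] for p in poly]
    return min(dots), max(dots)


def convex_polygons_intersect(a: List[Point], b: List[Point]) -> bool:
    """Separating axis test; polygons that merely touch count as intersecting."""
    for poly in (a, b):
        n = len(poly)
        for i in range(n):
            (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
            axis = (y1 - y2, x2 - x1)          # normal to this edge
            amin, amax = _project(a, axis)
            bmin, bmax = _project(b, axis)
            if amax < bmin or bmax < amin:
                return False                   # a separating line exists
    return True


def intervals_collide(x0_a: float, v_a: float, x0_b: float, v_b: float,
                      width: float = 1.0, t_end: float = 10.0) -> bool:
    """Moving collision restated as one static intersection in space-time."""
    return convex_polygons_intersect(world_polygon(x0_a, width, v_a, t_end),
                                     world_polygon(x0_b, width, v_b, t_end))


print(intervals_collide(0.0, 1.0, 10.0, -1.0))   # approaching head on
print(intervals_collide(0.0, 1.0, 5.0, 1.0))     # moving in parallel
```

No time stepping is involved: the question of whether the objects
ever touch is answered by a single static test, which is the
attraction of treating time as one more spatial dimension.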
Geometric modeling is also applicable to future work in
simulation. <Mechanical Simulation> involves computing the Newtonian
mechanics of everyday objects; problems which are immediately
approachable from a GEOMED foundation include simulated object
collision, statics, and pseudo friction. For example, consider what
is needed to predict the outcome of setting one more block at a
given place on an existing tower or of throwing one block into a
tower of other blocks. <Geometric Task Planning> problems include the
old A.I. favorite, block stacking, as well as the newer
research problems related to industrial assembly. Existing solutions
to geometric tasks are notoriously restricted; for example, I know of
no block stacking program that handles arbitrary rotations - all
blocks to date are piled on the square.
Although it has been recognized early and often that
numerical control of machine tools should be automated, the actual
implementation of a system that builds artifacts directly from a
geometric modeling program still lies in the future. As a start,
someone at any of the research labs with a general purpose
manipulator could begin by carving models out of soap or other soft
material with a rotating cutting tool.
Advanced mechanical simulations as well as <Animation for
Entertainment> quickly run into the problem of <Creature Simulation>:
given a multilegged bug, what control program is required to make
the bug walk through a building? Barring the darkness of war, the
greatest future demand for robotic simulation will come not from
governments, universities, or manufacturing industries but
rather from the entertainment industry. When it becomes economically
feasible to generate motion pictures and television programs by
computer graphics, great progress will be made in simulating visual
reality and in representing mundane situations in a computer.
Theoretical work in geometric modeling will continue to
pursue curved representations. Two problems that I would especially
like to see solved involve fitting curved surfaces to form a smooth
object (a manifold), as well as resurfacing an existing manifold
representation. Both problems involve more segmentation than
smoothing: it is easy to fit functions to the facial patches of an
object; it is hard to subdivide an object into the proper set of
patches. The one geometric algorithm that seems most ripe for future
quantitative study and logical analysis is the hidden line
elimination process.
Finally, progress in computer vision and geometric modeling
requires progress in systems software and computer systems. In my
opinion, recent university based research in programming languages is
overconcentrated in very high level language theory and automatic
programming. Future language and systems work should include
developing an incremental loader/assembler/debugger/editor that can
handle algebraic expressions, block structure, and node/link storage
notation as well as unvarnished machine instructions. Although
special purpose image processing hardware has earned a bad reputation
(starting with the Illiac-III), in my opinion a real vision system
will be composed of a large array of computer-like elements (4096 by
4096) that pipeline a stream of images into structured image
representations. The perceived images are then compared with
predicted images, and a detailed 3-D model is altered or constructed
in real time (24 images per second) using a small number of computers
(32 or fewer) which by the standards of our day (1974) would be very
large and very fast (ten megawords of main memory and ten megahertz
instruction execution).
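A rough calculation with the figures just quoted shows why the array
must reduce the images before the general purpose computers see them.
The Python sketch below rests on two assumptions that the text does
not state: that each array element handles one picture element, and
that "ten megahertz instruction execution" means ten million
instructions per second per computer.

```python
ARRAY_SIDE = 4096                      # elements along one side of the array
IMAGES_PER_SECOND = 24                 # real time rate quoted in the text
COMPUTERS = 32                         # upper bound quoted in the text
INSTRUCTIONS_PER_SECOND = 10_000_000   # assumed: one instruction per cycle

# Raw data rate entering the array (assumes one picture element per element).
elements_per_image = ARRAY_SIDE ** 2
elements_per_second = elements_per_image * IMAGES_PER_SECOND

# Combined instruction rate of the general purpose computers.
total_instructions_per_second = COMPUTERS * INSTRUCTIONS_PER_SECOND

# Instructions available per raw picture element, if nothing were reduced.
instructions_per_element = total_instructions_per_second / elements_per_second

print(elements_per_image, elements_per_second,
      total_instructions_per_second, round(instructions_per_element, 2))
```

Under these assumptions the 32 computers together could spend less
than one instruction on each raw picture element, so the element
level work has to be done in the array, with the computers operating
only on the structured image representations.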
Assuming the continuation of civilization with a growing
technology over the next one hundred to a thousand years,
developments in Computer Vision and Artificial Intelligence could
lead to robots, androids and cyborgs which will be able to see, to
think and to feel conscious. The utility of building (or becoming)
such entities is that as an android one would be smarter, more
sensitive and would live longer - one could live long enough to
explore the galaxy.
⊂10.4 Conclusions.⊃
The particular technical conclusions of this work include the
methods, system designs and data structures for geometric modeling
which have already been elaborated. Based on the details, one could
make such generalized observations as these: recursive windowing is a
good technique for spatial sorting; simple geometric representations
fall into space oriented and object oriented classes; the essence of
an object representation is its coherence under various operators; and
the power of a vision system might be enhanced by application of
3-D modeling techniques. However, in closing, I would like to draw
three rather more general conclusions, conclusions which by contrast
to the technical ones might be construed as scientific conclusions.
1. ~<The Nature of Perception>~. Perception is essential to
intelligence as it is the process which converts external sensations
into internal thoughts. There are two kinds of simple perception
systems: Stimulus-Response and Prediction-Correction Feedback;
together, S-R and P-C.F. can be formed into a compound perception
system.
2. ~<The Necessity to Experiment>~. Robotic hardware is
essential to Artificial Intelligence as an experimental science. It
is misleading to study only theoretical robotics of plausible
abstractions, mathematics, puzzles, games and simulations. The real
physical world is the best test of adaptive general intelligence. The
complexity and subtlety of real world situations, even of a situation
as seemingly finite as a digital television picture, cannot be
anticipated from a philosopher's armchair or from a programmer's
console.
3. ~<The Necessity to Simulate Visual Reality>~. Modeling is
essential to prediction-correction feedback perception. Although
simulated robot environments should not be used in place of the
external physical reality, such environmental simulations are an
essential part of a robot's internal mental reality. In the
particular case of vision, geometric models should be easy to adapt
to the basic mental abilities of present day computer hardware. To
conclude, perception requires two worlds: one that is the external
physical reality and the other that is the internal mental reality.